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An Introduction to Microcomputers, vol. II, Some Real Products, authored and published 
by Adam Osborne and Associates, Inc., 1977, pp. 6-18, 6-19. 

ART-UNIT: 237 

PRIMARY-EXAMINER: Zache; Raulfe B. 
ATTY-AGENT-FIRM: Barbee; Joe E. 

ABSTRACT: 

A microprocessor having separate bidirectional instruction and data busses is 
disclosed which allows the fetching of instructions from a program memory to be 
overlapped with the execution of instructions previously fetched. Program instructions 
are stored in an internal read-only-memory and/or in an external read-only-memory. 
Variable data is stored in an internal register array. During a given machine cycle, a 
data word in the register array can be transferred to an arithmetic-logic unit by a 
bidirectional data bus. The result of the operation performed by the arithmetic-logic 
unit can be transferred by the data bus back to the register array and stored in the 
selected location during the same machine cycle. Simultaneously, the contents of a 
program counter are transferred by a bidirectional instruction memory bus to the 
program memory to access the instruction to be executed on the following machine 
cycle. The addressed instruction is transferred from the program memory by the 
bidirectional instruction memory bus to the microprocessor and is stored to be decoded 
and executed on the following machine cycle. 

7 Claims, 25 Drawing figures 
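
The overlap described in the abstract can be illustrated with a minimal sketch. It assumes a straight-line program with no branches and a single instruction register; the function name `run_overlapped` and the trace format are hypothetical and are not taken from the patent.

```python
def run_overlapped(program):
    """Simulate a one-deep fetch/execute overlap and return a per-cycle trace.

    Each trace entry is (instruction fetched this cycle, instruction executed
    this cycle). The instruction fetched in cycle n executes in cycle n + 1.
    """
    pc = 0            # program counter, driven onto the instruction bus
    instr_reg = None  # instruction latched during the previous cycle
    trace = []
    while pc < len(program) or instr_reg is not None:
        executing = instr_reg                          # execute the earlier fetch
        fetching = program[pc] if pc < len(program) else None
        trace.append((fetching, executing))
        instr_reg = fetching                           # latch for the next cycle
        if fetching is not None:
            pc += 1
    return trace


trace = run_overlapped(["I0", "I1", "I2"])
for cycle, (fetched, executed) in enumerate(trace, start=1):
    print(cycle, "fetch:", fetched, "execute:", executed)
```

Under these assumptions three instructions complete in four machine cycles, where a strictly sequential fetch-then-execute scheme would need six.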
DOCUMENT-IDENTIFIER: US 4255785 A 

TITLE: Microprocessor having instruction fetch and execution overlap 

CLAIMS: 



1. An integrated circuit data processor capable of executing in an overlapping manner 
a plurality of macroinstructions stored in a memory in accordance with a plurality of 
machine cycles, the data processor also being capable of reading data operands from 
the memory containing the plurality of instructions, comprising: 

(a) an address register for storing a first address of a first instruction during a 
first machine cycle and for storing a second address of a second instruction during a 
second machine cycle, 

(b) an instruction register for storing the first instruction during the second 
machine cycle and for storing the second instruction during a third machine cycle, 

(c) a memory bus coupled to said address register and to said instruction register for 
transferring the first and second addresses from said address register to the memory, 
said memory bus also being for transferring the first and second instructions from the 
memory to said instruction register, 

(d) timing means coupled to said address register and to said instruction register for 
effecting the first, second and third machine cycles, 

(e) means coupled to said instruction register and to said timing means for executing 
the first and second instructions, said means effecting execution of the first 
instruction during the second machine cycle, 

(f) a data operand address register having an output coupled to said memory bus for 
providing to the memory bus the address of a data operand stored in the memory, 

(g) means for storing the data operand addressed by said data operand address 
register, said storing means having an input coupled to said memory bus for receiving 
from the memory bus the data operand from the memory, and 

(h) a read-only memory coupled to said address register and to said instruction 
register for storing a plurality of instructions for determining a sequence of 
operations to be performed by said data processor. 
ART-UNIT: 266 

PRIMARY-EXAMINER: Boudreau; Leo 
ASSISTANT-EXAMINER: Kelley; Chris 

ATTY-AGENT-FIRM: Townsend and Townsend and Crew LLP 

ABSTRACT: 

The present invention provides an image compression/decompression coprocessor which is 
integrated on a single chip. The coprocessor has a control unit which is connected by 
an internal, global bus to a number of different, special purpose processing units. 
Each of the processing units is specifically designed to handle only certain steps in 
compression and decompression processes. 

37 Claims, 34 Drawing figures 
L2: Entry 1 of 4 

File: USPT 

Dec 16, 1997 

DOCUMENT-IDENTIFIER: US 5699460 A 

TITLE: Image compression coprocessor with data flow control and multiple processing 
units 

Detailed Description Text (124): 

When ld_res_pac is asserted and no_update is not, Instruction 
Update Block 1123 loads its "update counter" with the instruction's "number of 
destinations" field (ND) from the result packet, and starts the update state machine. 
The process of modifying Operand Memory 821 requires three clock cycles per 
destination, and each destination is processed in turn. During the first clock cycle, 
Operand Memory 821 is read at the location selected by the 7 bit "instruction address" 
portion of the appropriate destination field within the update register, and the 
fetched 21 bit word is stored in three registers; the 5 bit semaphore field is stored 
in a "semaphore register", while each of the 8 bit operand address fields is stored in 
an "operand address register". During the second clock cycle, the most significant bit 
of the operand address register selected by the 1 bit "operand select" portion of the 
appropriate update register destination field is set to "1" to indicate "operand 
present"; the least significant 7 bits of this same register are loaded with the 
"result address" field from the update register. The semaphore register is loaded with 
the most significant 5 bits from the update register. During the final clock cycle, 
the contents of the semaphore register and each of the two operand address registers 
are written back to Operand Memory 821 at the same location they were read from. The 
update counter is decremented by "1" each time a destination is processed; when this 
counter is zero, Instruction Update Block 1123 is deactivated, and Main Controller 
Block 1121 restarts Instruction Enable Block 1113. 
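
The three-cycle read, modify, write sequence above can be sketched as bit-field operations on the 21 bit word. The field order inside the word (semaphore in the top 5 bits, then the two 8 bit operand address fields) is an assumption made for illustration; the passage gives the field widths but not their positions. All function names are hypothetical.

```python
def unpack_word(word):
    """Split a 21 bit Operand Memory word (assumed layout: sem | op0 | op1)."""
    semaphore = (word >> 16) & 0x1F
    operands = [(word >> 8) & 0xFF, word & 0xFF]
    return semaphore, operands


def pack_word(semaphore, operands):
    """Reassemble the 21 bit word from its three fields."""
    return ((semaphore & 0x1F) << 16) | ((operands[0] & 0xFF) << 8) | (operands[1] & 0xFF)


def update_destination(operand_memory, instr_addr, operand_select, result_addr, new_semaphore):
    """Apply one destination update, mirroring the three clock cycles."""
    # Cycle 1: read the word at the 7 bit instruction address and split it.
    semaphore, operands = unpack_word(operand_memory[instr_addr & 0x7F])
    # Cycle 2: set the "operand present" bit (MSB) and load the 7 bit result address.
    operands[operand_select] = 0x80 | (result_addr & 0x7F)
    semaphore = new_semaphore & 0x1F
    # Cycle 3: write all three fields back to the location they were read from.
    operand_memory[instr_addr & 0x7F] = pack_word(semaphore, operands)


memory = [0] * 128
update_destination(memory, instr_addr=5, operand_select=1, result_addr=0x2A, new_semaphore=0x03)
print(hex(memory[5]))
```

An outer loop that repeats `update_destination` once per destination and decrements an update counter would correspond to the ND field handling described above.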






L15: Entry 28 of 35 



File: USPT 



Feb 19, 1980 



US-PAT-NO: 4189768 

DOCUMENT-IDENTIFIER: US 4189768 A 
TITLE: Operand fetch control improvement 
DATE-ISSUED: February 19, 1980 

INVENTOR-INFORMATION: 
NAME                   CITY           STATE   ZIP CODE   COUNTRY 
Liptay; John S.        Rhinebeck      NY 
Rymarczyk; James W.    Poughkeepsie   NY 

ASSIGNEE-INFORMATION: 
NAME                                          CITY     STATE   ZIP CODE   COUNTRY   TYPE CODE 
International Business Machines Corporation   Armonk   NY                           02 

APPL-NO: 05/887091 
DATE FILED: March 16, 1978 

INT-CL: [02] G06F 1/00 

US-CL-ISSUED: 364/200 

US-CL-CURRENT: 712/204; 712/210 

FIELD-OF-SEARCH: 364/200 MS File 

PRIOR-ART-DISCLOSED: 

U.S. PATENT DOCUMENTS 

PAT-NO    ISSUE-DATE      PATENTEE-NAME     US-CL 
3496550   February 1970   Schachner         364/200 
3569938   March 1971      Eden et al.       364/200 
3588829   June 1971       Boland et al.     364/200 
3806888   April 1974      Brickman et al.   364/200 
3896419   June 1975       Lange et al.      364/200 



ART-UNIT: 237 

PRIMARY-EXAMINER: Springborn; Harvy E. 
ATTY-AGENT-FIRM: Goldman; Bernard M. 

ABSTRACT: 

Operand controls are provided in an I-unit using address operand pairs (AOPs), each 
pair consisting of a request register and a buffer register. When handling variable 
field length (VFL) instructions with source (SRC) and destination (DST) operand 
addresses, two AOPs are generally assigned to receive different parts of the first 
subline (e.g. doubleword) of the SRC operand; this is called a duplicate fetch and is 
used with any size VFL operand. Efficiency is improved for the special case in which 
the DST operand has all of its bytes confined to a single subline in main storage by 
detecting the special case and inhibiting a duplicate fetch signal to the I-unit 
controls which assign duplicate AOPs to an instruction. The SRC operand may have more 
than one subline but the alignment controls force all source operand bytes into a 
single subline for the special case. When the duplicate fetch signal is suppressed, 
only one AOP is assigned by the controls to the first subline fetch for the SRC 
operand. 

3 Claims, 53 Drawing figures 
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Feb 19, 1980 



DOCUMENT-IDENTIFIER: US 4189768 A 
TITLE: Operand fetch control improvement 

Detailed Description Text (64): 

FIG. 2 shows the principal elements making up the operand fetching logic. There are 
six address register/operand buffer pairs AOP-A through AOP-F, each having a 
respective one of operand address registers OAR-A through OAR-F and a respective one 
of operand buffers OP-A through OP-F. The illustrated register pair AOP-A in FIG. 2 
shows the operand address register OAR-A associated with the buffer register OP-A, and 
their controls. The same arrangement is found in each of the other register pairs AOP-B 
through AOP-F. A single pair AOP is capable of addressing and receiving a fetch of a 
doubleword of eight bytes from storage and buffering the received doubleword until its 
data is needed by E FCT block A15. For each instruction, enough of the six AOPs are 
assigned so that an entire operand field can be fetched and made available to the 
execution function as needed. 
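
Since each AOP fetches one eight-byte doubleword, the number of pairs an operand field needs follows from how many doublewords it touches. The sketch below is an illustration only: it assumes doubleword-aligned fetches and ignores the duplicate-fetch assignment described in the abstract. The function name is hypothetical.

```python
DOUBLEWORD_BYTES = 8  # one AOP addresses and buffers a doubleword of eight bytes


def doublewords_spanned(address, length):
    """Return how many aligned doublewords the operand bytes fall into."""
    if length <= 0:
        raise ValueError("operand length must be positive")
    first = address // DOUBLEWORD_BYTES
    last = (address + length - 1) // DOUBLEWORD_BYTES
    return last - first + 1


# An aligned 8 byte operand fits one doubleword; a 4 byte operand that
# starts at byte 6 crosses a boundary and touches two.
print(doublewords_spanned(0, 8), doublewords_spanned(6, 4))
```

Under these assumptions the result is the minimum number of AOPs that must be assigned before the whole field is available to the execution function.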
The described embodiment provides storage control (PSCF) for overlapping the handling 
of processor store requests between their generation by an instruction execution means 
(IPPF) and their presentation to system main storage (MS). 

The embodiment uses a store counter, an inpointer counter, an outpointer counter, a 
translator pointer register, an output counter and a plurality of register sets to 
process and control the sequencing of all store requests so that the PSCF can output 
them to MS in the order received from the IPPF. The embodiment uses the counters to 
coordinate the varying delays in PSCF processing of plural store requests contained in 
different register sets and the translator. 

The store counter obtains independence between plural IPPF operand address (OA) 
registers which send the store requests and plural PSCF register sets which handle the 
store requests. The number of OA registers is made independent of the number of 
register sets. The store counter is also used for serializing instruction control. 

36 Claims, 15 Drawing figures 
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File: USPT 



Apr 10, 1979 



DOCUMENT-IDENTIFIER: US 4149245 A 

TITLE: High speed store request processing control 

Detailed Description Text (2): 

FIGS. 1A-1F illustrate pertinent parts of the circuitry in the processor storage 
control function (PSCF) containing the subject invention. FIG. 1G illustrates the 
pertinent circuitry in the instruction preprocessing function (IPPF) interfacing the 
PSCF circuitry in FIGS. 1A-1F. The store and fetch requests are provided from the IPPF 
in FIG. 1G on line 10 to line 10 in the PSCF in FIG. 1A. FIG. 1G illustrates the 
derivation of store and fetch requests from operand address registers OA1 through OAn 
in the IPPF. Fetch requests are also provided from the instruction fetch control in 
the IPPF. Each request is presented to PSCF priority circuit 11 in FIG. 1A where it 
may contend for PSCF bus priority with other storage requests being delayed in the 
PSCF previously supplied by bus 10 from the IPPF. Contending requests have their 
processing completed, or partly completed, in the PSCF and are being held by any one or 
more of the four redo registers 16-1, 16-2, 16-3, or 16-4 shown in FIG. 1C or 1D, or 
in the translator (XL) pointer triggers 14A shown in FIG. 1A. A redo request signal is 
supplied on line 12, and a translator request signal is supplied on line 13 to PSCF 
priority circuit 11. Circuit 11 decides which received request is to be put on the 
PSCF bus by giving highest priority to a translator (XL) request on line 13, next 
higher priority to a redo store request on line 12, and lowest priority to a new store or 
fetch request from the IPPF on line 10. The IPPF sends only one request at one time. 
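
The fixed priority order applied by circuit 11 can be written as a short selection function. This is a sketch of the ordering only; the function name and the string labels are hypothetical, and the real circuit arbitrates signal lines rather than boolean arguments.

```python
def select_request(xl_request, redo_request, new_request):
    """Pick the request that wins the PSCF bus, or None if none is pending."""
    if xl_request:      # line 13: translator request, highest priority
        return "XL"
    if redo_request:    # line 12: redo store request, next priority
        return "REDO"
    if new_request:     # line 10: new store or fetch request from the IPPF
        return "NEW"
    return None


print(select_request(xl_request=False, redo_request=True, new_request=True))
```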
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Data format converting apparatus is described for simultaneously converting multiple 
bytes of zoned decimal data to packed decimal data or vice versa. In the preferred 
embodiment, this format converting apparatus is obtained by adding a minimum amount of 
additional circuitry to a multibyte flow-through type data shifter used for providing 
the normal data shifting operations in a digital data processor. In particular, a 
zoned-decimal-to-packed-decimal conversion capability is provided by combining 
additional switching logic with the normal shifter switching logic for enabling the 
conductors for nonadjacent data fields on the shifter input data bus to be coupled to 
the conductors for adjacent data fields on the shifter output data bus. A 
packed-decimal-to-zoned-decimal conversion capability is provided by adding further 
switching logic for enabling the conductors for adjacent data fields on the shifter 
input data bus to be coupled to the conductors for nonadjacent data fields on the 
shifter output data bus. Control circuitry is provided for selectively enabling either 
normal data shifting operations or zoned-to-packed format conversion operations or 
packed-to-zoned format conversion operations. The shifting and format converting 
hardware is organized so that implementation in the form of large-scale integration 
circuitry can be accomplished with a minimum number of integrated circuit chips and a 
minimum number of chip input/output connections per chip. 

18 Claims, 41 Drawing figures 
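
The data mapping behind the two conversions can be sketched in software. The example assumes the conventional System/370 style formats, which the abstract does not spell out: a zoned byte holds a zone nibble and a digit nibble, the zone of the last byte carries the sign, and a packed field ends with the sign nibble. The patent performs this with shifter switching logic on multiple bytes at once; the functions below (hypothetical names) only show which nonadjacent nibbles become adjacent and vice versa.

```python
def zoned_to_packed(zoned):
    """Gather the digit nibbles of a zoned field and append the sign nibble."""
    digits = [b & 0x0F for b in zoned]   # digit nibbles, nonadjacent in zoned form
    sign = zoned[-1] >> 4                # sign lives in the zone of the last byte
    nibbles = digits + [sign]
    if len(nibbles) % 2:                 # pad on the left to a whole byte
        nibbles = [0] + nibbles
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, len(nibbles), 2))


def packed_to_zoned(packed, zone=0xF):
    """Spread packed digit nibbles into zoned bytes; the sign moves to the last zone."""
    nibbles = []
    for b in packed:
        nibbles += [b >> 4, b & 0x0F]
    *digits, sign = nibbles
    body = [(zone << 4) | d for d in digits[:-1]]
    last = (sign << 4) | digits[-1]
    return bytes(body + [last])


zoned = bytes([0xF1, 0xF2, 0xC3])        # +123 in zoned decimal
packed = zoned_to_packed(zoned)
print(packed.hex(), packed_to_zoned(packed).hex())
```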
DOCUMENT-IDENTIFIER: US 4141005 A 

TITLE: Data format converting apparatus for use in a digital data processor 

Detailed Description Text (7): 

Briefly considering in a general way the procedure for a more or less typical machine 
language program instruction, the first step is to fetch the instruction from the main 
storage 17 and to set it into an instruction register 26. This is accomplished by 
reading the next instruction address from the instruction counter in local store 25 
and setting such address into a storage address register (SAR) 27 for the main storage 
17. Such address is supplied to SAR 27 by way of a B register 28 and an assembler 29. 
The addressed instruction is read from main storage 17 and supplied to the instruction 
register 26 by way of data buses 18 and 13, byte shifter and format converter 14, data 
buses 15 and 30, destination (D) register 31 and data buses 32 and 33. As part of the 
instruction fetching operations, the operand addresses are calculated from the base 
and displacement values contained in the instruction and such results are set into 
appropriate operand address registers in local store 25. Also, the instruction counter 
in local store 25 is updated so as to contain the address of the next machine 
instruction. 
OTHER PUBLICATIONS 

James L. Turley, "Advanced 80386 Programming Techniques," Osborne McGraw-Hill, 1988, 
pp. 45-84 and pp. 283-315. 

Intel, Pentium Processor Family Developer's Manual, vol. 3: Architecture and 
Programming Manual, 1995, pp. 3-1 to 24, 10-1 to 10-13, 11-1 to 11-25. 

Pentium® Pro Family Developer's Manual, vol. 3: Operating System Writer's Guide, 
© Intel Corporation 1996, Chapters 2-4, pp. 2-1 through 4-29. 

Intel Architecture Software Developer's Manual, vol. 1: Basic Architecture, © 
Intel Corporation 1996, 1997, pp. 3-1 through 3-15. 

ART-UNIT: 2183 

PRIMARY-EXAMINER: Pan; Daniel H. 

ATTY-AGENT-FIRM: Merkel; Lawrence J. Meyertons, Hood, Kivlin, Kowert & Goetzel, P.C. 

ABSTRACT: 

A processor supports a processing mode in which the default address size is greater 
than 32 bits and the default operand size is 32 bits. The default address size may be 
nominally indicated as 64 bits, although various embodiments of the processor may 
implement any address size which exceeds 32 bits, up to and including 64 bits, in the 
processing mode. The processing mode may be established by placing an enable 
indication in a control register into an enabled state and by setting a first 
operating mode indication and a second operating mode indication in a segment 
descriptor to predefined states. Additionally, an instruction prefix may be coded into 
an instruction to override the default address and/or operand size. Thus, an address 
size of 32 bits may be used when desired, and an operand size of 64 bits may be used 
when desired. 
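
The default sizes and prefix overrides described in this abstract can be sketched as two small functions. The particular "predefined states" of the two segment descriptor indications are not given in the abstract, so the `predefined` tuple below is a placeholder assumption, as are all names. The 64 bit address size is the nominal value the abstract mentions.

```python
def mode_active(enable, first_indication, second_indication, predefined=(True, False)):
    """True when the control register enable is set and both descriptor
    indications match their predefined states (placeholder values)."""
    return enable and (first_indication, second_indication) == predefined


def effective_sizes(addr_prefix=False, operand_prefix=False):
    """Return (address bits, operand bits) for one instruction in the mode."""
    address_bits = 32 if addr_prefix else 64      # default address size exceeds 32 bits
    operand_bits = 64 if operand_prefix else 32   # default operand size is 32 bits
    return address_bits, operand_bits


if mode_active(enable=True, first_indication=True, second_indication=False):
    print(effective_sizes(), effective_sizes(operand_prefix=True))
```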
ATTY-AGENT-FIRM: Christie, Parker & Hale 
ABSTRACT: 

A data processor in which the address fields within the instructions may be of two 
different lengths in terms of the number of address digits in the field. The number of 
digits in the address field is determined by the digit in the most significant digit 
position of the address. If the most significant digit is coded to be a special 
character, the next six digits are used as the address. If the most significant digit 
is not coded to be the special character but a decimal digit, it is used together with 
the next four digits as the address. 
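
The two address-field lengths can be illustrated with a small parser. The abstract does not say which character is the special one, so `SPECIAL = "A"` is a stand-in chosen for this sketch, and the function name is hypothetical.

```python
SPECIAL = "A"  # placeholder for the special character; not specified in the abstract


def parse_address(field):
    """Return (address, characters consumed) for an address field string."""
    if field[0] == SPECIAL:
        digits = field[1:7]      # long form: the next six digits are the address
        consumed = 7
    else:
        digits = field[0:5]      # short form: first digit plus the next four
        consumed = 5
    if len(digits) != (6 if consumed == 7 else 5) or not digits.isdigit():
        raise ValueError("malformed address field")
    return int(digits), consumed


print(parse_address("A123456"), parse_address("12345"))
```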
ABSTRACT: 

Apparatus for scaling addresses received by a memory module in a modular 
requestor-memory system in which standard memory modules may be of a discretely 
variable size and utilized in a plurality of positions in an overall contiguous memory 
addressing scheme. In particular, this scaling apparatus enables a modular memory 
which is only partially populated, i.e., only able to respond to a subset of the set 
of all addresses available, to be located in any one of several positions representing 
different addressing ranges. This is accomplished without modification of the memory 
module itself. The memory module knows its discrete capacity, or size, by virtue of 
the population of the memory array storage locations (array cards) contained therein. 
The memory module then uses this information to scale, or strip off, the appropriate 
number of bits from the gross address to allow addressing of the restricted number of 
memory locations present in the memory module. 
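
Stripping high-order bits from the gross address can be sketched as a mask operation. The example assumes module sizes that are powers of two, which the abstract does not state, and uses a hypothetical function name.

```python
def scale_address(gross_address, module_size):
    """Keep only the low-order bits that address locations present in the module."""
    if module_size <= 0 or module_size & (module_size - 1):
        raise ValueError("this sketch assumes a power-of-two module size")
    return gross_address & (module_size - 1)


# The same gross address maps into a 16K module and into a 4K module.
print(hex(scale_address(0x12345, 0x4000)), hex(scale_address(0x12345, 0x1000)))
```

Because only the module's own size enters the calculation, the module can sit in any addressing range without modification, which is the property the abstract emphasizes.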
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ART-UNIT: 232 

PRIMARY-EXAMINER: Bowler; Alyssa H. 
ATTY-AGENT-FIRM: Sowell; John B. Starr; Mark T. 

ABSTRACT: 

The present invention provides a novel multi-mode DRAM controller adapted to access 
DRAM chips of a main storage unit of different size and of different mode types. The 
novel DRAM controller comprises new address generation and control logic for delaying 
the RAS and CAS control signals to memory and for expanding the number of address bits 
employed to address memory chips having a greater number of addresses by at least one 
address bit. 
OTHER PUBLICATIONS 
Shiva, "Computer Design and Architecture", 1985, p. 336. 
ART-UNIT: 237 

PRIMARY-EXAMINER: Richardson; Robert L. 

ASSISTANT-EXAMINER: Barry; Lance Leonard 

ATTY-AGENT-FIRM: Blakely Sokoloff Taylor & Zafman 

ABSTRACT: 

A method and apparatus are provided for enabling a computer that is capable of running 
programs utilizing different address sizes to run those programs without having to 
modify the computer's hardware. A mask register is used to identify bits of a sum of 
register addresses that are valid for the program that is running. The number of valid 
bits in the register mask can be changed to correspond to the addressable memory size 
for different programs. 
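
The masking step can be shown in a few lines. The sketch assumes the address is formed as the sum of a base and an index register, and that the mask has one bit set for each valid address bit; the names `mask_for` and `effective_address` are hypothetical.

```python
def mask_for(valid_bits):
    """Build a register mask with the given number of low-order valid bits."""
    return (1 << valid_bits) - 1


def effective_address(base, index, mask):
    """Sum the register addresses, then keep only the bits the mask marks valid."""
    return (base + index) & mask


# The same sum wraps to 0 for a 24 bit program but not for a 32 bit program.
print(hex(effective_address(0xFFFFFF, 1, mask_for(24))),
      hex(effective_address(0xFFFFFF, 1, mask_for(32))))
```

Changing only the mask value switches the addressable memory size, so no hardware change is needed between programs.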
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OTHER PUBLICATIONS 
PRIMARY-EXAMINER: Dixon; Joseph L. 
ASSISTANT-EXAMINER: Nguyen; Hiep T. 

ATTY-AGENT-FIRM: Marshall, Jr.; Robert D. Kesterson; James C. Donaldson; Richard L. 
ABSTRACT: 

There is disclosed a system and method for operating a memory controller in a manner 
which will allow memories with differing address sizes to be connected to a common 
bus. The controller decodes address information to change from a determined default 
address size to another address size on a dynamic basis during the actual memory 
access cycle. Upon detection of larger memory size, an adjustment occurs in the 
presentation of address information on the common address bus. 
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ABSTRACT: 

A microcontroller that provides an environment to run processes developed to run on 
several prior or low end generation machines with the independent register, status and 
data space needed for execution, that is, the resources of the microcontroller are a 
superset of the resources of the prior generation machine. The ability to limit one 
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data space segmentation controlled by upper order address bits not accessible by the 
independent processes. The separate workspaces are configured substantially like a 
workspace of a prior or low end generation machine allowing the microcontroller to 
perform the tasks of several independent prior or low end generation machines working 
in concert. 
D.R. Ditzel, et al. "The Hardware Architecture of the CRISP Microprocessor", The 14th 
Annual International Symposium on Computer Architecture, Jun. 1987, pp. 309-319. 

ART-UNIT: 273 

PRIMARY-EXAMINER: Pan; Daniel H. 

ATTY-AGENT-FIRM: McDermott, Will & Emery 

ABSTRACT: 

A data processor according to the present invention executes instructions described in 
first and second instruction formats. The first instruction format defines a 
register-addressing field of a predetermined size, while the second instruction format 
defines a register-addressing field of a size larger than that of the 
register-addressing field defined by the first instruction format. The data processor 
includes: instruction-type identifier, responsive to an instruction, for identifying 
the received instruction as being described in the first or second instruction format 
by the instruction itself; a first register file including a plurality of registers; 
and a second register file also including a plurality of registers, the number of the 
registers included in the second register file being larger than that of the registers 
included in the first register file. If the instruction-type identifier has identified 
the received instruction as being described in the first instruction format, the data 
processor executes the instruction using data held in the first register file. On the 
other hand, if the instruction-type identifier has identified the received instruction 
as being described in the second instruction format, the data processor executes the 
instruction using data held in the second register file. 

26 Claims, 25 Drawing figures 
CROSS REFERENCE: 0018-8689-24-11A-5410 
DISCLOSURE TEXT: 

5p. High performance processors like the IBM System/370 Model 3033 use a hardwired 
instruction preprocessing function unit (IPPF) and microcoded execution (E) unit. An 
ideal instruction goes through the pipelined machine in four cycles as follows: 
In the first cycle (D/A), the instruction is decoded to determine the needs in terms 
of general purpose registers (GPRs) and operands from cache, and information relevant 
to its execution activity is placed in a four-position queue between the IPPF and the 
E unit. In the next cycle (C1), the operand access for the instruction, if any, is 
begun. In the third cycle (C2), the operand arrives from the cache (assuming a hit in 
the cache) while the micro-instruction to execute the instruction is read out of 
control storage and the working registers in the E unit are set up in preparation for 
the execution of the instruction. The instruction is executed in the next cycle (E), 
and the results are put away in the following cycle. 

In general, instructions processed by the CPU are dependent on each other. The design 
of both the instruction (I) unit and the E unit implicitly recognizes such dependence 
to speed up the processing of instructions. The Load Bypass is an example of such a 
mechanism in the I unit, while the Wrap to A register and Wrap to B register serve a 
similar purpose in the E unit. 

The purpose here is to disclose a mechanism that explicitly recognizes when certain 
frequent, mutually dependent instructions are processed by the I unit, creates a 
'super-instruction' that does the same work as the instruction sequence it replaces, 
and executes the super-instruction in the E unit in fewer cycles than the original 
instruction sequence. For some pairs this is accomplished without changes in the E 
unit except for changes in microcode. 

There is no change in the sequence or number of instructions processed by the I unit 
or the number of cycles taken by the I unit. For some pairs there is no change in the 
total amount of work done by the E unit or the sequence in which it is done (except 
that it is done in fewer cycles). Thus, there should be no problems with architecture 
specified orders or instruction retry. 

The concept is illustrated below in the context of a frequent pair of instructions: 
(TM, BC(R)), where TM is Test under Mask and BC is Branch on Condition, though it will 
work equally well for (CLI, BC(R)), (LTR, BC(R)), and other pairs. If the I unit 
decodes a TM followed by a BC, then a super-instruction can replace the instruction 
sequence (TM, BC). This is done by reading out the micro-instruction(s) that 
correspond to the super-instruction rather than just the TM. The BC Successful signal 
is generated during the execution cycle of the TM instruction (early generation of BC 
Successful); thus, the execution cycle of the BC is a null cycle. Thus, in this case 
the super-instruction corresponds to a micro-instruction sequence that does the work 
currently being done for TM; the execution cycle activity for BC, having been done 
during the execution cycle of TM, is eliminated. 

The concept of super-instruction is 
described in greater depth below and a specific implementation is provided. The loop in 
Fig. 1 occurs frequently in scientific/engineering applications. One possible 
execution is shown in Fig. 2 (D, A, C, E, P, B refer respectively to Decode, Address 
generation, Cache access, Execute, Putaway and BXLE preexecution in the I unit). 

Even with the reduced E times, the multiply-add loop is still E unit limited. A 
further improvement in E time is obtained by eliminating the separate LD cycle in the 
E unit. This is accomplished as follows: If the I unit, on successive cycles, decodes 
the two instructions LD 0,X and MD 0,Y, it creates a new super-instruction 
MD* 0,FLB1,FLB2. This instruction is written over the LD instruction in the queue 
between the I and E units. A new microstore address is provided and during set-up, 
FLB1 and FLB2 are gated to A and B. The multiply proceeds as usual with the result 
going to floating point register 0. The same scheme can be used with additional pairs, 
such as (LD, AD), (LD, SD), etc. 

A typical timing sequence with the LD overlap is shown in Fig. 3. An LD is decoded in 
cycle 7 and placed in the queue. This location would normally not be active until 
cycle 13. During cycle 8, MD is decoded and the potential for a super-instruction is 
recognized. The earlier LD information in the queue is replaced by MD*. In cycle 13, a 
new micro-instruction is read out and the I unit causes the transfers FLB1 -> A and 
FLB2 -> B. (Two Op buffer copies are therefore required.) It should be noted that by 
cycle 11 the operands are actually available in the E unit. The multiplication 
continues from cycle 14 to 18 under control of a new micro-routine. 

A single set of six operand buffers can be utilized in one to one relation with the 
six operand address registers. The buffers are loaded from the cache bus. During 
staging, the I unit can gate the operand buffers to either the A or the B staging 
registers. The proposed super-instruction scheme is shown in Fig. 4, and the buffers 
are loaded simultaneously. Copy 1 of the buffers goes to A-reg and copy 2 to B-reg. 
The I unit uses appropriate control lines to activate the transfers FLB1 -> A and 
FLB2 -> B in parallel. Once the hardware penalty is paid, various pairs of 
instructions can be handled with only micro-store impact. 

The advantages are as follows: 
1. Significant performance improvement for sample loops. 
2. No software impact. 
3. Maintains one at a time execution philosophy. Exception handling/Instruction retry 
handled by new micro-routine. (If necessary the pair can be reexecuted sequentially, 
i.e., with the super-instruction feature switched off.) 
4. Once the hardware penalty is paid, very little penalty for additional 
super-instructions. 
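
The pair recognition described in this disclosure can be sketched as a pass over a decoded instruction stream. This is a simplified software model under stated assumptions: each fusible pair collapses to one E-unit queue entry, the table lists only two of the pairs named in the text, and the label "TM+BC" is invented for illustration (the text names only MD* for the LD/MD pair).

```python
# Pairs named in the disclosure; the fused labels other than "MD*" are hypothetical.
FUSIBLE_PAIRS = {
    ("TM", "BC"): "TM+BC",
    ("LD", "MD"): "MD*",
}


def fuse(opcodes):
    """Replace each adjacent fusible pair with a single super-instruction entry."""
    fused = []
    i = 0
    while i < len(opcodes):
        pair = tuple(opcodes[i:i + 2])
        if pair in FUSIBLE_PAIRS:
            fused.append(FUSIBLE_PAIRS[pair])   # one queue entry does the work of two
            i += 2
        else:
            fused.append(opcodes[i])
            i += 1
    return fused


print(fuse(["LD", "MD", "AD", "TM", "BC"]))
```

Switching the feature off, as item 3 above allows, corresponds to passing the stream through unchanged so that the pair executes sequentially.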

SECURITY: Use, copying and distribution of this data is subject to the restrictions in the Agreement For IBM 
TDB Database and Related Computer Databases. Unpublished - all rights reserved under the Copyright Laws of 
the United States. Contains confidential commercial information of IBM exempt from FOIA disclosure per 5 
U.S.C. 552(b)(4) and protected under the Trade Secrets Act, 18 U.S.C 1905. 

COPYRIGHT STATEMENT: The text of this article is Copyrighted (c) IBM Corporation 1982. All rights reserved. 
PUB-NO         PUB-DATE          LANGUAGE   PAGES   MAIN-IPC 
SU 478307 A    August 5, 1975               000 

INT-CL (IPC): G06F 9/06 

ABSTRACTED-PUB-NO: SU 478307A 
BASIC-ABSTRACT: 

This invention concerns computer technology and can be used for controlling the 
operation of a programmed specialized digital computer, or for implementing standard 
subprogrammes in general-purpose digital computers. Simplifications are proposed on 
grounds of economy. Essentially, in the proposed device each output of a distributer 
(1) is connected to the corresponding recording inputs of an operand address decoder 
(5), operand address registers (3, 4), operation code decoder (6), and instruction 
address registers (7, 8). The circuitry also includes output rails (2), outputs 
(9-18), inputs (19, 20), and the diode matrices (21-28). As a result the device no 
longer requires a permanent storage device, and there is a saving of one operand 
address register and one operation code register. If the circuitry is based on logical 
elements, there is an advantage in using magnetic switches which have perforated 
cores. 
ABSTRACTED-PUB-NO: SU 1269147A 
BASIC-ABSTRACT: 

The circuitry contg. a register (1) at the data input (2), operand memories (3,4), 
mask coders (7,8) and the operand length code input (9), has a clearing input to the 
OR-gate (10), AND-gates (11,14), NOT-gate (13), sync input (15), trigger (16), a byte 
address subtraction control circuit (18), byte address subtractors (19,20), 
multiplexers (24,25) and the decoders (26,27). 

Initially the value of two LSB digits of the external memory addresses for each 
operand are entered in the input register. In performing e.g. the operation of decimal 
addition, the first operand may be 12 bytes in length and have 10 as the external 
memory address; the other operand can be seven bytes in length and have the address 
01. The byte address subtraction control circuit indicates the byte in a word from the 
local memory address of which it is necessary to subtract 1. The byte address 
subtractors modify the byte address of the first and second operands by subtraction of 
1 on address to the memory. Data of variable length can be arranged in integer 
boundaries of words. 

USE/ADVANTAGE - In processors of medium and high productivity computers, operating 
speed is increased in pre-processing of variable-length operands. Bul. 41/7.11.86 
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BASIC-ABSTRACT: 

NOVELTY - The apparatus includes a short instruction memory (SIM) (105), SIM fetch 
logic (103), instruction decoder, very long instruction word (VLIW) instruction memory 
(VIM) (109), VLIW memory address unit (VIM AGU) and associated VIM address registers. 
A VIM address generation mechanism selects a VLIW in a VIM by generating a VIM 
address. 

DETAILED DESCRIPTION - The apparatus includes an instruction register for storing an 
instruction including direct address bits. The VIM AGU determines if the instruction 
in the instruction register is a direct VIM addressing mode instruction and provides 
direct addressing mode control signals to the VIM address generation mechanism. 
INDEPENDENT CLAIMS are also included for the following: 

(a) Base plus index addressing mode apparatus; 

(b) Circular indexed addressing mode apparatus; 

(c) Processing element providing method; 

(d) Processing element selective execution method; 

(e) Synchronous MIMD operation providing method 
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USE - For instruction addressing in indirect very long instruction word processor in 
signal processing application. 

ADVANTAGE - Allows for greater flexibility in opcode space making more bits available 
by the use of register based address modes. Allows synchronous MIMD mechanism for 
selection of different VLIW in each processing element in parallel. 

DESCRIPTION OF DRAWING(S) - The figure illustrates a 2 x 2 VLIW processor. 
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